State Estimation for the Individual and the Population in Mean Field Control with Application to Demand Dispatch
This paper concerns state estimation problems in a mean field control
setting. In a finite population model, the goal is to estimate the joint
distribution of the population state and the state of a typical individual. The
observations are noisy measurements of the population.
The general results are applied to demand dispatch for regulation of the
power grid, based on randomized local control algorithms. In prior work by the
authors it has been shown that local control can be carefully designed so that
the aggregate of loads behaves as a controllable resource with accuracy
matching or exceeding traditional sources of frequency regulation. The
operational cost is nearly zero in many cases.
The information exchange between grid and load is minimal, but it is assumed
in the overall control architecture that the aggregate power consumption of
loads is available to the grid operator. It is shown that the Kalman filter can
be constructed to reduce these communication requirements.

Comment: To appear, IEEE Trans. Auto. Control. Preliminary version appeared in
the 54th IEEE Conference on Decision and Control, 201
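The Kalman filter construction mentioned above can be illustrated with a generic linear-Gaussian sketch. This is a hypothetical stand-in, not the paper's actual mean-field model: the matrices, noise levels, and the interpretation of y as the measured aggregate power are all illustrative assumptions.

```python
import numpy as np

# Hypothetical linear-Gaussian model for an aggregate (mean-field) state:
#   x[t+1] = A x[t] + B u[t] + w[t],   w ~ N(0, Q)   (state dynamics)
#   y[t]   = C x[t] + v[t],            v ~ N(0, R)   (noisy aggregate power)
def kalman_filter(A, B, C, Q, R, x0, P0, us, ys):
    """Standard Kalman filter recursion; returns the state estimates."""
    x, P = x0, P0
    estimates = []
    for u, y in zip(us, ys):
        # Predict using the model dynamics
        x = A @ x + B @ u
        P = A @ P @ A.T + Q
        # Update with the noisy aggregate measurement
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ (y - C @ x)
        P = (np.eye(len(x)) - K @ C) @ P
        estimates.append(x.copy())
    return estimates
```

The point of the construction in the abstract is that such an estimator lets the grid operator work from a noisy aggregate measurement rather than detailed per-load communication.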
Individual risk in mean-field control models for decentralized control, with application to automated demand response
Flexibility of energy consumption can be harnessed for the purposes of
ancillary services in a large power grid. In prior work by the authors a
randomized control architecture is introduced for individual loads for this
purpose. In examples it is shown that the control architecture can be designed
so that control of the loads is easy at the grid level: Tracking of a balancing
authority reference signal is possible, while ensuring that the quality of
service (QoS) for each load is acceptable on average. The analysis was based on
a mean field limit (as the number of loads approaches infinity), combined with
an LTI-system approximation of the aggregate nonlinear model. This paper
examines in depth the issue of individual risk in these systems. The main
contributions of the paper are of two kinds:
Risk is modeled and quantified:
(i) The average performance is not an adequate measure of success. It is
found empirically that a histogram of QoS is approximately Gaussian, and
consequently each load will eventually receive poor service.
(ii) The variance can be estimated from a refinement of the LTI model that
includes a white-noise disturbance; variance is a function of the randomized
policy, as well as the power spectral density of the reference signal.
Additional local control can eliminate risk:
(iii) The histogram of QoS is truncated through this local control, so that
strict bounds on service quality are guaranteed.
(iv) This has insignificant impact on the grid-level performance, beyond a
modest reduction in capacity of ancillary service.

Comment: Publication without appendix to appear in the 53rd IEEE Conf. on
Decision and Control, December, 201
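Point (ii) can be illustrated with a minimal sketch: if the QoS of a load is modeled as the output of a first-order LTI system driven by white noise (a hypothetical stand-in for the refined aggregate model; the pole a and noise level are illustrative, and a non-flat reference PSD would shape the input in the actual analysis), then the stationary variance has a simple closed form that a Monte-Carlo run can confirm.

```python
import numpy as np

def stationary_variance(a, sigma_w):
    """Stationary variance of the scalar LTI recursion
         q[t+1] = a*q[t] + w[t],   w ~ N(0, sigma_w^2),   |a| < 1.
       Closed form: sigma_w^2 / (1 - a^2)."""
    return sigma_w ** 2 / (1.0 - a ** 2)

def empirical_variance(a, sigma_w, T=100_000, seed=0):
    """Monte-Carlo check of the closed-form stationary variance."""
    rng = np.random.default_rng(seed)
    q = 0.0
    samples = np.empty(T)
    for t in range(T):
        q = a * q + rng.normal(0.0, sigma_w)
        samples[t] = q
    return samples[T // 10:].var()  # discard the initial transient
```

A pole close to 1 (slow local dynamics) inflates the variance, which is one way to see why average performance alone understates individual risk.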
Beyond the Big Leave: The Future of U.S. Automotive Human Resources
Based on industry interviews and trend analyses, this report forecasts employment levels and hiring nationwide and in Michigan through 2016, and compiles automakers' input on technical needs, hiring criteria, and suggestions for training and education curricula.
Zap Q-Learning for Optimal Stopping Time Problems
The objective in this paper is to obtain fast converging reinforcement
learning algorithms to approximate solutions to the problem of discounted cost
optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on
a compact subset of Euclidean space. We build on the dynamic programming
approach taken by Tsitsiklis and Van Roy, wherein they propose a Q-learning
algorithm to estimate the optimal state-action value function, which then
defines an optimal stopping rule. We provide insights as to why the convergence
rate of this algorithm can be slow, and propose a fast-converging alternative,
the "Zap-Q-learning" algorithm, designed to achieve optimal rate of
convergence. For the first time, we prove the convergence of the Zap-Q-learning
algorithm in the linear function approximation setting. We use
ODE analysis for the proof, and the optimal asymptotic variance property of the
algorithm is reflected via fast convergence in a finance example.
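A rough, tabular illustration of the underlying algorithm is sketched below. This is a simplified stand-in: the paper works with linear function approximation, and Zap replaces the vanishing scalar step size used here with a matrix gain (a stochastic Newton-Raphson step) to achieve the optimal asymptotic variance. Only the Tsitsiklis-Van Roy style update is shown.

```python
import numpy as np

def optimal_stopping_q(P, g, beta, n_iters=100_000, seed=0):
    """Tabular Q-learning for discounted optimal stopping on a finite
    Markov chain with transition matrix P and stopping reward g.
    Q[x] estimates the value of *continuing* from state x; the induced
    rule stops when g[x] >= Q[x].  Update (Tsitsiklis-Van Roy style):
        Q[x] += alpha * (beta * max(g[x'], Q[x']) - Q[x])
    """
    rng = np.random.default_rng(seed)
    n = len(g)
    Q = np.zeros(n)
    counts = np.zeros(n)
    x = 0
    for _ in range(n_iters):
        x_next = rng.choice(n, p=P[x])
        counts[x] += 1
        alpha = 1.0 / counts[x]              # vanishing scalar step size
        target = beta * max(g[x_next], Q[x_next])
        Q[x] += alpha * (target - Q[x])
        x = x_next
    return Q
```

The slow convergence the paper analyzes comes precisely from this scalar 1/n step size interacting with the eigenvalues of the mean dynamics, which is what the matrix gain in Zap-Q-learning is designed to correct.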